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1. Introduction 

Fermionic operators (D) satisfying the Ginsparg- Wilson relation [jl| 

l5 D + D l5 = — D l5 D (1.1) 
m 

allow to solve the chirality problem of four-dimensional QCD at non-vanishing lattice spac- 
ing j| — |§(for different reviews at recent lattice conferences see e.g. ]l0| — []l8| ). 

Clearly, it would be very useful to exploit these new developments in numerical studies 
of QCD. In the past years several groups have calculated already e.g. the quenched hadron 
spectrum and light quark masses with better and better accuracies (for recent results see 
the overlap formulation []l9| — [^2| and a related work with the perfect and chirally improved 
actions p3|] ). 

Due to limitations in computational resources no result is available for dynamical, 
four-dimensional QCD with Ginsparg- Wilson fermions. Some exploratory studies were 
carried out in the Schwinger model and suggestions were made, which could help the four- 



dimensional full QCD case ||— J3J. 

In this letter we present exploratory tests using dynamical, four-dimensional QCD 
with Ginsparg- Wilson fermions. We start with the Zolotarev optimal rational polynomial 
approximation |3^]. The partial fraction expansion of the rational polynoms leads to a 
particularly simple expression for the fermionic force of the hybrid Monte-Carlo. In addition 
to the usual fermion matrix inversion we have another inversion due to the inverse in 
the partional fraction expansion. These nested inversions are very CPU consuming. By 
projecting out the smallest eigenmodes in the inner loop a significant speed up could be 
reached. 

As we emphasized, our results are exploratory. In addition they are obtained on 
absurdly small lattices. As usual, direct physical interpretation will be possible only after 
studying larger lattices and approaching the continuum limit. Nevertheless, these first 
results can be used as references in order to cross check future studies. 
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2. Hybrid Monte-Carlo for QCD with overlap fermions 



First we fix our notations. The massless Neuberger-Dirac operator (or overlap operator) 
D can be written as 

D = m [l+ l5 sgn(H w )}, (2.1) 

This D operator satisfies eq. (|1.1|). Hw is the hermitian Dirac operator, Hw = "J^Dw, 
which is built from the massive Wilson-Dirac operator, Dw, defined by 

[Dw] xy — $xy I^W 

m 

where kw is related to mo by k^} = 8 — mo/2. The mass is introduced in the overlap 
operator by 

D(m) = (1 - m)D + m. (2.2) 



In the sign function of eq. (fEll) one uses sgn.(Hw) = Hw/yH w . ^he n th order 
Zolotarev optimal rational approximation for \j^fx in some interval [l,x max ] can be ex- 
pressed by elliptic functions (see e.g. f3j|). For most of the purposes a more transparent 
suboptimal choice e(x) with e(l) = 1 is sufficient 1 

s g n W ^ tW ^,n ,f; +02 '!^t C2 ' ) v < 2 ' 3 > 

^ (a^ + C2J_l)/(l + C 2 l-l) 

where 

sn 2 (/K/(2n + 1);k) , ; , , 

C ' = l- S J (i A7(2n + !),«) ■ -vT^T^, (2.4) 

the Jacobian elliptic function sn(u, k) = rj is defined by the elliptic integral 

n(r/)= - ^ (2.5) 

w; 7o ^(l-t 2 )(l-/^ 2 ) v 7 



and X = is the complete elliptic integral. A particularly useful form of eq. (|2.3| ) and 
an approximation for the sign function is given by partional fractioning 

e(x) = x(x 2 + c 2n ) V - T — ^ , (2.6) 

^ ^ + C 2 /-l 

where the 6; parameters are expressed by the q coefficients of eq. (|2,4|). In the rest of the 
paper we use the approximate e(x) instead of the sign function in the Dirac operator 

D = m [l + l5 e(h w )], (2.7) 

where we normalize Hw by its smallest eigenvalue: hw = Hw /\^mm\- This choice ensures 
that all the eigenvalues of h w are within the interval [l,x max ] where x max is taken to be 
larger than the condition number of Hw- 

1 In our test we use a 20th order approximation with e.g. a; ma x=10 11 . It is easy to check that this choice 
corresponds to a relative accuracy of 0(1O -5 ). The difference between the optimal and suboptimal choices 
is one order of magnitude smaller. 



- 2 - 



We follow the standard procedure to implement the fermionic operator of eq. ( |2.1| ) 
into a hybrid Monte-Carlo |34[] QCD algorithm. We analyze two flavours, thus D^D is used 
as the Dirac operator. The fermionic determinant for these two flavours can be given by 
introducing the pseudofermionic fields 

det(D^D) = J d^d<pexp(-S p ) with S p = <ft (D* D)~ x 4>. (2.8) 

As usual, the integral is calculated stochastically by generating Gauss distributed <f> pseu- 
dofermions. The contribution of the pseudofermions to the force has the usual form 

gE = -^t^^ wi th ^ = {rtD)-% (2.9) 

Compared to the hybrid Monte-Carlo with Wilson fermions the new feature is the 
somewhat more complicated structure of the overlap operator. This complication is three- 
fold. 

a. First of all, it appears in the inversion of the fermion operator. 

b. Secondly, the complication is present in the structure of the derivative term in the 
fermionic force. 

c. Thirdly, the fermion force has Dirac-delta type singularities. 

ad a. The inversion of the fermion operator if) = (D^D) -1 ^ is done by n Q conjugate 
gradient steps ("outer inversion"). Note, however, that each step in this procedure needs 
the calculation of (D'D)(f>. The operator D contains e(hyy), which is given by the partional 
fraction expansion (see eq. |2.6| ), Thus, at each "outer" conjugate gradient step one needs 
n different inversions. Fortunately, these inversions differ only by a constant term c^i-i 
(I = 1, n). It means, that this "inner inversion" can be done by one multi-shift conjugate 
gradient [^7j procedure in rtj steps, and one is not forced to carry out n different conjugate 
gradient inversions. This nested conjugate gradient procedure needs all together n Q ■ m 
matrix-vector multiplications. It is already well known from the quenched analysis, that 
the number of steps in the inner inversion can be significantly reduced by projecting out the 
smallest eigenmodes and performing the conjugate gradient steps only in the orthogonal 
subspace. 

ad b. In the fermionic force the derivative with respect to the link variable U can 
be straightforwardly calculated from the partial fraction expansion eq. Q2.6|) . The term 
coming from the e function reads 



i=i 



+(c2j_i - c 2n ) , 2 kw (h w ^f + ^f-hw) } hi>h (2-10) 

r% + c 2 i-\ dU dU 



where the definition 



V>2 = (/4 + C2Z-l)~V (2.H) 

was introduced. In order to calculate the force one has to determine ipi. The above inversion 
for the force is done by a multi-shift conjugate gradient process in additional rif steps. Note, 
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When 


New momenta 


Refraction 


(N, H) 2 > 2AS 


H - N{N, H) + Ny/(N, H) 2 - 2AS 


Reflection 


(N, H) 2 < 2AS 


H-2N{N, H) 



Table 1: Refraction and reflection can happen to the system when approaches a zero-eigenvalue 
surface of Hyy. The conditions and the new momenta are indicated. H is the momentum before 
the refraction/reflection. 



however, that this inversion increases the computational effort only marginally. Having 
obtained ip in (n -rtj) multiplication steps, tpi can be obtained just by inverting hyy + C2i-\- 
All together the determination of ipi needs (n -nj+nj) matrix- vector multiplications. 

ad c. Since the fermionic force is the derivative of a non-analytic function, we expect 
non-trivial behaviour near these non-analyticities. This feature is already present in a 
classical one-dimensional motion of a point-particle in a step function potential. A finite 
stepsize integration of the equation of motion will not notice the step in the potential or the 
Dirac-delta in the force. As a consequence the action has a large change which might lead 
to bad acceptance rate in the Monte-Carlo simulations. One can improve on this situation. 
During the integration one should check whether the particle moved from one side to the 
other one of the step function. If it is necessary, one corrects its momentum and position. 
This correction has to be done also in the case of the overlap fermion. The microcanonical 
energy, 

H = ±(H, H) + S„[U] + S P [U, 4>] = \{H, H) + 5(17, </>] (2.12) 

has a step function type non-analyticity on the the zero-eigenvalue surfaces of the H\y 
operator in the space of link variables coming from the pseudofermion action. In eqn. ( 2.12j ) 



the (A, B) = — tr(A Xt/1 B x ^) scalar product and H anti-hermitian gauge momenta 
were introduced. When the microcanonical trajectory reaches one of these surfaces, we 
expect either reflection or refraction. If the momentum component, orthogonal to the zero- 
eigenvalue surface, is large enough to compensate the change of the action between the 
two sides of the singularity (AS) then refraction should happen, otherwise the trajectory 
should reflect off the singularity surface. Other components of the momenta are unaffected. 
The anti-hermitian normal vector (iV) of the zero-eigenvalue surface can be expressed with 
the help of the gauge derivative (in our shorthand notation DX) as 



A=0 



where DX = (X\DHyv\X). Table [l] summarizes the conditions of refraction and reflection 
and the new momenta. 

We have to modify the standard leap-frog integration of the equations of motion in 
order to take into account reflection and refraction. This can be done in the following 
way. The standard leap-frog consists of three steps: an update of the links with stepsize 
r/2, an update of the momenta with r and finally another update of the links, using the 
new momenta, again with r/2, where r is the stepsize of the integration. The system can 
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only reach the zero-eigenvalue surface during the update of the links. We have to identify 
the step in which this happens. This can be done by predicting the change of the lowest 
eigenvalues during one elementary link update using the derivatives -DA. At the stepsizes 
we used, this procedure turned out to be very reliable. After identifying the step in which 
the zero eigenvalue surface is reached, we have to replace it with the following three steps: 

1. Update the links with t\, so that we reach exactly the zero-eigenvalue surface. t\ can 
be determined with the help of DX. 

2. Modify the momenta according to Table [I]. 

3. Update the links using the new momenta, with stepsize r/2 — t\. 

This procedure is trivially reversible and it also preserves the integration measure (see 
Appendix). 

Including reflection and refraction into the update is a crucial point in the simulations. 
If the system does not notice the step in the action due to the finite stepsize integration, 
there will be huge jumps in the energy leading to a very bad acceptance ratio in the final 
accept /reject step. 

3. Numerical tests 

As it was discussed by many authors (and we also illustrated above) the dynamical hybrid 
Monte-Carlo for QCD with overlap fermions is computationally extremely intensive due 
to the nested inversion. 0(100) conjugate gradient steps is usually enough for Wilson or 
staggered fermions. In the overlap formalism one is confronted with O(100 2 ) matrix-vector 
multiplications. Therefore, with a straightforward hybrid Monte-Carlo and with present 
medium-size machines only absurdly small lattices can be studied. Nevertheless, these 
studies can show the feasibility of the algorithm and can be used for cross-checking future 
results. 

We studied our hybrid Monte-Carlo on V = 2 4 , 4 and on 4 • 6 3 lattices with mo = 1-6, 
with mass parameters m = 0.1, 0.2 and (3 between 5.3 and 6.1. The length of our trajectories 
were 1. We used At = 0.01 — 0.05 as time-steps for the molecular dynamical evolution. The 
Metropolis accept/reject step at the end of the trajectories resulted in a 30-80% acceptance 
rate. 

We used a 20th order rational polynomial approximation. This choice gives the sign 
function with a relative accuracy of 0(1O~ 5 ). Note, that changing the order of the approx- 
imation from 10 to 20 increased the computational effort only by 20%. 

In order to accelerate the inner inversion we projected out the eigenmodes with the 
smallest eigenvalues. The inversion was then performed in the orthogonal subspace. We 
studied the computational requirements as a function of the number of the projected eigen- 
modes. The projections were done by the ARPACK code. The optimum was found around 
20 eigenmodes. The projection leads to an important observation. The operator Hy/ might 
have rather small eigenvalues e.g. O(10 -6 ). In order to project out eigenmodes one has to 
solve eigenvalue equations. In these equations the sum of 0(1) numbers should result in 
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0(1CP 6 ). This is clearly beyond the accessible region of 32-bit arithmetics. Therefore, we 
used 64-bit precision. 

In addition to the standard consistency tests (reversibility of the trajectories and 
At 2 scaling of the action) we performed a brute force approach on 2 2 and 4 4 lattices. 
We generated quenched configurations, then we explicitely calculated the determinants of 
mo[l +75c(/iw)]- These determinants were used in an additional Metropolis accept/reject 
step. The hybrid Monte-Carlo results agree completely with those of the brute force ap- 
proach. 

Table § gives informations on our run parameters and test results. 



V 


(Am) 


number of trajectories 


Plaquette 


2 4 


(5.6,0.2) 


900 


3.44(1) 


4 4 


(5.4,0.2) 


1200 


2.572(4) 


4-6 3 


(5.78,0.1) 


400 


3.22(1) 



Table 2: Expectation values of the plaquette variable for different volumes and parameters. 




200 400 600 800 1000 5.4 5.6 5.8 6 

N $ 

Figure 1: The time history of the plaquette (left panel) on a 4 4 lattice at m = 0.2 and (3 = 5.4. 
The j3 dependence (right panel) of the Polyakov-loop on 4-6 3 lattices at m = 0.1. (Quenched results 
suggest that the pion mass could be around 200-250 MeV for our parameter choice at ,9=5.7) 



The left panel of Figure |l| shows the time history of the plaquette variable in one of our 
4 4 runs. We also observed changes between topological sectors. The ratio of topologically 
non-trivial and trivial gauge configurations was found to be about the same as in the brute 
force approach (it is around or below the percent level). On the right panel we present the (3 
dependence of the Polyakov-loop. These runs were obtained on 4 • 6 3 lattices [35]. One can 
see a sharp increase around (3 ~ 5.7. As usual, before drawing any physical interpretation 
one should proceed to the continuum limit (see also Ref. [36]). 
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4. Conclusions 



In this paper we described a hybrid Monte-Carlo algorithm for dynamical, nf=2, four- 
dimensional QCD with overlap fermions. We used a modified version of the MILC collab- 
oration's code. We started with the Zolotarev optimal rational polynomial approximation 
to numerically implement the sign function. Using the partional fraction expansion of the 
rational polynomial resulted in a particularly simple form for the fermionic force. The in- 
versions due to the fermion operator ( "outer inversion" ) and due to the rational polynomial 
denominator ("inner inversion") were done successively by conjugate gradient processes. 
It was possible to significantly accelerate the inner inversion by projecting out the lowest 
eigenmodes and to use a multi-shift solver for the different terms in the partional fraction 
expansion. 

We extended the standard leap-frog integration of the trajectories by including the 
refraction and reflection on the zero-eigenvalue surfaces of the Wilson-Dirac operator. The 
inclusion of these effects increased the acceptance rate of the algorithm significantly. 

We compared our hybrid Monte-Carlo results with those obtained by a brute force 
approach (quenched configurations with Metropolis accept/reject steps for the exactly cal- 
culated overlap determinant). A complete agreement was found for 2 4 and 4 4 lattices. 

When writing up this paper an independent hybrid Monte-Carlo was written [37]. We 
cross-checked the results for 4 4 lattices and a complete agreement was found. 
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Appendix 

In this appendix we examine the area-preserving property of the modified leap-frog 
procedure described in the text. 

Let us start with an example of the iV-dimensional Euclidean coordinate space which 
shows the basic idea of the proof in a transparent way. 

We solve the equations of motion with a finite stepsize integration of the following 
Hamiltonian: 

H = -p a Pa + S(sgnM{q)), 

where q a ,p a (a = 1 . . . N) are the coordinates and the momenta. M depends only on the 
coordinates and the action S is a smooth function (note that q a , M and S are analogous 
to the links, the fermion matrix and the fermionic action, respectively). The standard 
leap-frog algorithm can be effectively applied to this system, as long as the trajectories do 
not cross the zero-eigenvalue surface of M (X(q) = 0, where A(g) is the eigenvalue with 
smallest magnitude 2 ). 

We have to modify the leap-frog algorithm, when the coordinates reach the zero- 
eigenvalue surface. Instead of the original leap-frog update of the coordinates, where the 
constant p a momenta are used for the time r/2, we first update the coordinates with p a 
until the surface, then we change the momentum to p' a , which is used to evolve q a for the 
remaining time. In case of refraction one has the following phase space transformation: 



q' = q + T lP + (r/2 - r x )p' (4.1) 
p = p — n{np) + n{np') 



where n is the normalvector of the surface, AS is the potential jump along the surface, 
and {np') 2 = {np) 2 — 2AS. t\ is the time required to reach the surface with the incoming 
momenta p. The transformation for reflection is given by 

q' = q + rip + (r/2 - T X )p' (4.2) 
p' = p — 2n(np) 

In the following we will not deal with this case (one can obtain the Jacobian of reflection 
by simply setting {np') = —{np) in the Jacobian of refraction). 

First let us concentrate on the q,p dependence of t\. T±{q,p) is determined from the 
condition \{q + T\{q,p)p) = 0. One obtains the partial derivatives of t\ with respect to 
q,p by expanding this zero-eigenvalue condition to first order in 8q or 5p. First take the Sq 
variation: 



X{q a + npa + dq a + -^—dqbPa) = \{q + np) + — 
OQb Oq a 



q+Tip 



0t\ 

{5ab + -^—Pa)5qb = 
dq b 



2 We do not deal with the possibility of degenerate zero eigenvalues which appears only on a zero measure 
subset of the zero-eigenvalue surface. 
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Since the normalvector is just 



n n 



dq a 



q+rip 



I — I 



we have for the partial derivative of t\ with respect to q: 

n a 



dq a 



(np) 



Similarly one gets for the partial derivative with respect to p: 



dp a 



n„ 



-Ti 



[np) 



There is an important identity between the q and p derivatives of a function, which 
depends only on q + i~i(q,p)p. (Two examples are n and AS.) Let us evaluate p and q 
derivatives of an arbitrary g(q + T\{q,p)p) function: 



dg 
dq a 



dg 
dqb 



dg 
dp a 



dqb 



fx _l dTl \ d 9 

(dab + -^—Pb) = 

oq a oq b 
( x . dn dg 

(Tldab + T^—Pb) = 

dpa oq b 



q+rxp 



(4 

q+rip 

(Sab 



ab 



_ n a p b ^ 
(np) 

n a Pb s 
(np) ' 



which gives 



dg 
dp a 



dg 
dq a 



(4.3) 



Now we can consider the four different partial derivatives required for the Jacobian: 



J 



/dg' <9g / N 

I dq dp 

I V ¥ 

\ dq dp / 



whose determinant gives the change in the Euclidean measure d ' qd N p due to the given 
phase space transformation. Introducing 



Z, 



ab 



dqb ' 



one incorporates all terms which arise from the q dependence of the normalvector and AS*. 
In case of a straight wall with constant potential jump this matrix vanishes. (Clearly, for 
QCD with overlap fermions these objects are very hard to calculate; they usually require 
the diagonalization of the whole H\y matrix ). Using ( [4.3j ) the other three components of 
J can also be expressed with the help of Z. Denoting 



(np') 
(np) 



1: 



(np) 
(np') 



1 
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and 

Pab = S a b + xn a n b , 

we have 

P ab = 5 o,b + yn a n b - 

In terms of the P, P~ l and Z matrices the Jacobian is very simple. We can split it into 2 
parts: the first part is a matrix with determinant one and all Z factors are in the second 
term: 

J= (PriP+ (r/2 - njP-A ( (r/2 - n)Z (r/2 - n) n z\ 
" \0 P- 1 ) + \ Z T1 Z ) ' 

Let us introduce J' as the product of J and the inverse of its first term: 



J' = ( I ® i + e ® n pz, 




where E is defined as 



E 




E has an eigenvector v\ oc (n, —1) with zero eigenvalue. The V2 oc (1,ti) vector is orthog- 
onal to v\ and has the property to give zero in the product V2EV2 = 0. In the orthonormal 
basis given by v\ and V2 J' has the form: 



J' = ®i+ 1 ® nPZ, 





thus det J' = 1. Since J and J' differs only in a matrix with determinant one, we arrive 

det J = 1, 



thus the transformations ( |4.l| , [4.2; ) preserve the integration measure. 

The proof for the SU(3) case was carried out in a completely analogous way. The only 
difference was the appearance of factors associated with the group structure of SU (3) which 
all canceled out in the final result. Thus, we conclude that the suggested modification of 
the leap-frog conserves the integration measure. 
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